💡 Visualizing Eindhoven Public Lights¶

📘 Introduction¶

This notebook explores the Eindhoven Public Lights dataset, which contains detailed information about the city's public lighting infrastructure. Each entry includes the geographic location of a light fixture, its placement and planned maintenance dates, type, power, and color.

🎯 Goals¶

The aim of this data visualization project is to:

  • Map the locations of public lights across Eindhoven to understand distribution and density.
  • Simulate nighttime lighting by visualizing the color and intensity of each light fixture on an interactive map.
  • Support maintenance planning by highlighting overdue or upcoming maintenance tasks and offering an overview for decision-making.

By combining spatial, temporal, and categorical data, we seek to generate insights valuable for city planners, maintenance engineers, and even residents curious about how their city is lit and maintained.

🛠️ Tools and Technologies¶

To explore and visualize this dataset, the following tools will be used:

  • Python for data handling and scripting
  • Pandas for loading and preprocessing the dataset
  • Plotly and Folium for interactive mapping and geospatial visualizations
  • Matplotlib or Seaborn for supplementary charts
  • Datetime for time-based analysis
  • Jupyter Notebook as an interactive environment for combining code, visuals, and narrative
In [43]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import ast 
import re
import folium
import datetime
import plotly.express as px
import ipywidgets as widgets
from IPython.display import display
from folium.plugins import MarkerCluster
from folium.plugins import HeatMap
from geopy.distance import geodesic

📂 Data Loading & Preprocessing¶

To begin the analysis, we load the Eindhoven Public Lights dataset and inspect its structure to understand what columns are available and how the data is formatted.

Load The Dataset¶

In [ ]:
file_path = 'public_lights_Eindhoven.csv'
df = pd.read_csv(file_path, delimiter=';', on_bad_lines='skip')

df.head()
Out[ ]:
   OBJECTID  ZONE        DISTRICT         NEIGHBORHOOD     DATE_PLACEMENT  DATE_MAINTENENCE  TYPE  COLOR  WATTAGE  LUMEN  GEO_SHAPE
0  3803829   3 Tongelre  33 Oud-Tongelre  336 Karpen       2020-10-07      2025-10-08        son   geel   100      9700   {"coordinates": [5.510598756938196, 51.4538568...
1  3803831   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2016-04-22      2021-04-20        son   geel   100      9700   {"coordinates": [5.506433769949912, 51.4570174...
2  3803833   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2023-08-18      2028-08-24        son   geel   100      9700   {"coordinates": [5.511550782327051, 51.4528870...
3  3803836   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2021-03-16      2026-07-29        sont  geel   100      10700  {"coordinates": [5.511833410292869, 51.4524190...
4  3803838   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2023-08-18      2028-08-24        son   geel   100      9700   {"coordinates": [5.512105789367028, 51.4522177...

Extract Coordinates from GEO_SHAPE¶

The GEO_SHAPE column contains coordinates in a JSON-like string. Let’s extract longitude and latitude.

In [7]:
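def extract_coords(geo_str):
    """Return latitude and longitude parsed from a GEO_SHAPE string."""
    try:
        # The first two numbers in the string are longitude and latitude, in that order
        matches = re.findall(r"[-+]?\d*\.\d+|[-+]?\d+", geo_str)
        if len(matches) >= 2:
            lon, lat = float(matches[0]), float(matches[1])
            return pd.Series([lat, lon])
    except (TypeError, ValueError):
        pass
    # Fall back to missing values if parsing fails or fewer than two numbers are found
    return pd.Series([None, None])

df[['latitude', 'longitude']] = df['GEO_SHAPE'].apply(extract_coords)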
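
Since the GEO_SHAPE preview looks like GeoJSON, an alternative to the regular expression above is to parse the string with `json.loads`. The sketch below assumes that each value is a complete, valid JSON object with a `coordinates` list in `[longitude, latitude]` order. The truncated preview suggests this but does not confirm it. The helper name `parse_point` and the sample string are hypothetical.

```python
import json
import math

def parse_point(geo_str):
    """Return (latitude, longitude) from a GeoJSON-like point string, or (nan, nan)."""
    try:
        # Assumption: coordinates are stored as [longitude, latitude]
        lon, lat = json.loads(geo_str)["coordinates"][:2]
        return float(lat), float(lon)
    except (TypeError, ValueError, KeyError):
        # Not a string, invalid JSON, missing key, or fewer than two coordinates
        return math.nan, math.nan

sample = '{"coordinates": [5.5106, 51.4539], "type": "Point"}'  # made-up example value
print(parse_point(sample))
print(parse_point("not json"))
```

If the assumption holds for the real file, the same helper could be applied to `df['GEO_SHAPE']` and the resulting pairs split into the `latitude` and `longitude` columns.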

Clean and Convert Date Columns¶

In [8]:
df['DATE_PLACEMENT'] = pd.to_datetime(df['DATE_PLACEMENT'], errors='coerce')
df['DATE_MAINTENENCE'] = pd.to_datetime(df['DATE_MAINTENENCE'], errors='coerce')

Final Check¶

In [ ]:
df.info()

df.isnull().sum()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 56581 entries, 0 to 56580
Data columns (total 13 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   OBJECTID          56581 non-null  int64         
 1   ZONE              56581 non-null  object        
 2   DISTRICT          56581 non-null  object        
 3   NEIGHBORHOOD      56581 non-null  object        
 4   DATE_PLACEMENT    56580 non-null  datetime64[ns]
 5   DATE_MAINTENENCE  56320 non-null  datetime64[ns]
 6   TYPE              56163 non-null  object        
 7   COLOR             44565 non-null  object        
 8   WATTAGE           56581 non-null  int64         
 9   LUMEN             56581 non-null  int64         
 10  GEO_SHAPE         56581 non-null  object        
 11  latitude          56581 non-null  float64       
 12  longitude         56581 non-null  float64       
dtypes: datetime64[ns](2), float64(2), int64(3), object(6)
memory usage: 5.6+ MB
Out[ ]:
OBJECTID                0
ZONE                    0
DISTRICT                0
NEIGHBORHOOD            0
DATE_PLACEMENT          1
DATE_MAINTENENCE      261
TYPE                  418
COLOR               12016
WATTAGE                 0
LUMEN                   0
GEO_SHAPE               0
latitude                0
longitude               0
dtype: int64
In [ ]:
df.dropna(subset=['latitude', 'longitude'], inplace=True)

df.head()
Out[ ]:
   OBJECTID  ZONE        DISTRICT         NEIGHBORHOOD     DATE_PLACEMENT  DATE_MAINTENENCE  TYPE  COLOR  WATTAGE  LUMEN  GEO_SHAPE                                          latitude   longitude
0  3803829   3 Tongelre  33 Oud-Tongelre  336 Karpen       2020-10-07      2025-10-08        son   geel   100      9700   {"coordinates": [5.510598756938196, 51.4538568...  51.453857  5.510599
1  3803831   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2016-04-22      2021-04-20        son   geel   100      9700   {"coordinates": [5.506433769949912, 51.4570174...  51.457017  5.506434
2  3803833   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2023-08-18      2028-08-24        son   geel   100      9700   {"coordinates": [5.511550782327051, 51.4528870...  51.452887  5.511551
3  3803836   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2021-03-16      2026-07-29        sont  geel   100      10700  {"coordinates": [5.511833410292869, 51.4524190...  51.452419  5.511833
4  3803838   3 Tongelre  33 Oud-Tongelre  337 Koudenhoven  2023-08-18      2028-08-24        son   geel   100      9700   {"coordinates": [5.512105789367028, 51.4522177...  51.452218  5.512106

📊 Exploratory Data Analysis¶

1. Summary Statistics and Distributions¶

The next output shows the 10 most common light types used in Eindhoven’s public lighting system. LED and PLL lights are by far the most frequently used, suggesting a strong shift toward energy-efficient technologies.

In [ ]:
df['TYPE'].value_counts().head(10)
Out[ ]:
TYPE
led      20752
pll      18546
son       5661
sont      3813
plt       1877
tle       1617
cdm-t      905
tllll      764
cpo-t      486
tld        344
Name: count, dtype: int64

Next, we display the most common light colors in the dataset. White lights dominate the network, followed by yellow. Other colors like orange, blue, green, and red are used far less frequently, likely for special purposes or decorative areas.

In [ ]:
df['COLOR'].value_counts().head(10)
Out[ ]:
COLOR
wit       34249
geel       9474
oranje      439
blauw       302
groen        65
Rood         36
Name: count, dtype: int64

Now, we see the distribution of light wattage across all fixtures. Most lights have low to moderate wattage (median = 36W), indicating energy-efficient usage. However, a few high-wattage outliers (up to 1914W) suggest powerful fixtures in specific locations.

In [ ]:
df['WATTAGE'].describe()
Out[ ]:
count    56581.000000
mean        56.998321
std         64.969117
min          0.000000
25%         32.000000
50%         36.000000
75%         64.000000
max       1914.000000
Name: WATTAGE, dtype: float64

2. Basic Map: Public Lights by Location¶

The interactive map below shows a random sample of 1,000 public lights in Eindhoven, with markers placed at their geographic locations. Lights are color-coded by their actual light color—yellow for “geel” (yellow) and white for all others—offering a quick visual overview of color distribution across the city.

In [42]:
eindhoven_center = [df['latitude'].mean(), df['longitude'].mean()]
m = folium.Map(location=eindhoven_center, zoom_start=13)

sample_df = df.sample(1000)

for _, row in sample_df.iterrows():
    folium.CircleMarker(
        location=[row['latitude'], row['longitude']],
        radius=2,
        color='yellow' if row['COLOR'] == 'geel' else 'white',
        fill=True,
        fill_opacity=0.7
    ).add_to(m)

m

3. Lights Over Time¶

The bar chart shows the number of public lights installed each year. Installations increased significantly after 2010, peaking between 2018 and 2021. This trend suggests a major infrastructure update, likely involving the adoption of newer, energy-efficient lighting systems.

In [15]:
df['DATE_PLACEMENT'].dt.year.value_counts().sort_index().plot(kind='bar', title='Number of Lights Installed per Year')
Out[15]:
<Axes: title={'center': 'Number of Lights Installed per Year'}, xlabel='DATE_PLACEMENT'>

4. Maintenance Timeline¶

Next, we look at maintenance urgency: 19,015 lights have a planned maintenance date earlier than 30 days from today. Because the filter below has no lower bound, this count includes fixtures that are already overdue as well as those due within the next 30 days. Either way, it represents a significant workload for maintenance teams and underlines the importance of prioritization and route planning.

In [ ]:
today = pd.Timestamp.today()
upcoming = df[df['DATE_MAINTENENCE'] < today + pd.Timedelta(days=30)]

print(f"⚠️ Lights overdue or due for maintenance within 30 days: {len(upcoming)}")
⚠️ Lights overdue or due for maintenance within 30 days: 19015
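
The filter `DATE_MAINTENENCE < today + 30 days` does not distinguish between past and future dates. To report the two groups separately, one option is to add a lower bound, as in the sketch below. The reference date and the four maintenance dates are made up for illustration; on the real data, `dates` would be `df['DATE_MAINTENENCE']` and `today` would be `pd.Timestamp.today()`.

```python
import pandas as pd

# Made-up maintenance dates and a fixed reference day, for illustration only
today = pd.Timestamp("2025-06-01")
dates = pd.Series(pd.to_datetime(["2021-04-20", "2025-06-15", "2025-10-08", None]))

horizon = today + pd.Timedelta(days=30)
overdue = dates < today                            # planned date already passed
due_soon = (dates >= today) & (dates < horizon)    # planned within the next 30 days

# Missing dates (NaT) compare as False and fall into neither group
print(f"Overdue: {overdue.sum()}, due within 30 days: {due_soon.sum()}")
```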

5. Missing Data Overview¶

The heatmap visualizes missing values across the dataset. Most columns are complete. The main gap is COLOR, which is missing for 12,016 of 56,581 entries (about 21%); TYPE (418 missing) and DATE_MAINTENENCE (261 missing) have far fewer gaps. These gaps may affect certain visualizations or analyses and should be handled accordingly.

In [ ]:
plt.figure(figsize=(10, 6))
sns.heatmap(df.isnull(), cbar=False, cmap='viridis')
plt.title("Missing Data Heatmap")
plt.show()

6. Top Neighborhoods with the Most Lights¶

The horizontal bar chart shows the 10 neighborhoods with the highest number of public lights. Grasrijk, Binnenstad, and Blixembosch-Oost lead the list, indicating areas with either higher population density, larger surface area, or greater infrastructure investment.

In [18]:
df['NEIGHBORHOOD'].value_counts().head(10).plot(kind='barh', title='Top 10 Neighborhoods by Number of Lights')
plt.xlabel('Number of Lights')
plt.ylabel('Neighborhood')
plt.show()

7. Power vs. Lumen Scatter Plot¶

The scatter plot visualizes the relationship between wattage and brightness (lumen) for public lights. While higher wattage generally correlates with higher lumen output, the spread and clusters suggest the use of various lighting technologies with differing efficiency levels. Outliers with extremely high values likely represent special-purpose lights.

In [19]:
plt.figure(figsize=(8,6))
sns.scatterplot(data=df, x='WATTAGE', y='LUMEN', alpha=0.3)
plt.title("Wattage vs Lumen")
plt.xlabel("Wattage")
plt.ylabel("Lumen")
plt.show()
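
One way to make the efficiency differences between technologies explicit is to compute luminous efficacy (lumen per watt) per light type. The sketch below uses a small made-up sample: the `son` and `sont` rows reuse the wattage and lumen values shown in the data preview, while the `led` and `pll` rows are invented. The column name `efficacy_lm_per_w` is hypothetical. Rows with zero wattage are skipped, since the WATTAGE column has a minimum of 0.

```python
import pandas as pd

# Small sample shaped like the dataset; led and pll values are made up
lights = pd.DataFrame({
    "TYPE": ["son", "sont", "led", "led", "pll"],
    "WATTAGE": [100, 100, 20, 0, 36],
    "LUMEN": [9700, 10700, 2400, 0, 2900],
})

# Skip zero-wattage rows to avoid dividing by zero
valid = lights[lights["WATTAGE"] > 0].copy()
valid["efficacy_lm_per_w"] = valid["LUMEN"] / valid["WATTAGE"]

# Median efficacy per type, most efficient first
print(valid.groupby("TYPE")["efficacy_lm_per_w"].median().sort_values(ascending=False))
```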

8. Color Usage Distribution¶

The next bar chart shows the distribution of the six most common light colors in the dataset. White lights are overwhelmingly dominant, followed by yellow. Other colors like orange, blue, green, and red are used far less, likely for aesthetic or specialized purposes.

In [20]:
df['COLOR'].value_counts().head(6).plot(kind='bar', title='Most Common Light Colors')
plt.ylabel('Count')
plt.show()

9. Outlier Detection¶

The boxplot highlights the distribution and outliers for both wattage and lumen values. Most lights have relatively low wattage and lumen output, while a number of extreme outliers—especially for lumen—indicate the presence of very bright, high-powered lights likely used in large or specialized areas.

In [21]:
df[['WATTAGE', 'LUMEN']].boxplot()
plt.title('Boxplot of Wattage and Lumen')
plt.show()
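
By default, a boxplot draws its whiskers at 1.5 times the interquartile range (IQR) and marks everything beyond them as an outlier. With the wattage quartiles reported earlier (25% = 32 W, 75% = 64 W), the IQR is 32 W, so the fences would sit at 32 − 48 = −16 W and 64 + 48 = 112 W. The sketch below applies the same rule to a made-up series; the helper name `iqr_fences` is hypothetical, and on the real data `wattage` would be `df['WATTAGE']`.

```python
import pandas as pd

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences located k * IQR beyond the quartiles."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Made-up wattage values, for illustration only
wattage = pd.Series([18, 32, 32, 36, 36, 36, 64, 64, 100, 1914])
lower, upper = iqr_fences(wattage)

# Values outside the fences are what the boxplot would draw as individual points
outliers = wattage[(wattage < lower) | (wattage > upper)]
print(lower, upper, outliers.tolist())
```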

EDA Summary¶

  • Most lights were installed after 2010, with peaks in specific years.
  • The most common light type is LED, and the most frequent color is white.
  • Wattage and lumen values are tightly clustered, with a few higher outliers.
  • Some entries are missing COLOR or TYPE, which will be handled or ignored in later visualizations.

📊 Visualizations¶

Interactive Map of Lights (Colored by Light Type)¶

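This map displays 1,000 sampled light locations, color-coded by light type using the `type_colors` mapping below (for example, orange for son, blue for sont, and green for led). Types not listed in the mapping, such as pll, appear in gray. Hovering over a marker shows its type, color, wattage, and installation date. The map provides a geographic overview of which lighting technologies are used where, which can inform urban design or highlight areas that may benefit from upgrades.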

In [23]:
eindhoven_center = [df['latitude'].mean(), df['longitude'].mean()]
m = folium.Map(location=eindhoven_center, zoom_start=13)

sample = df.dropna(subset=['latitude', 'longitude', 'TYPE']).sample(1000)

type_colors = {
    'son': 'orange',
    'sont': 'blue',
    'led': 'green',
    'halogeen': 'purple',
    'tl': 'red'
}

for _, row in sample.iterrows():
    light_type = row['TYPE'].lower() if isinstance(row['TYPE'], str) else 'unknown'
    color = type_colors.get(light_type, 'gray')
    tooltip = f"Type: {row['TYPE']}<br>Color: {row['COLOR']}<br>Wattage: {row['WATTAGE']}W<br>Installed: {row['DATE_PLACEMENT'].date()}"
    
    folium.CircleMarker(
        location=[row['latitude'], row['longitude']],
        radius=3,
        color=color,
        fill=True,
        fill_opacity=0.6,
        tooltip=tooltip
    ).add_to(m)

m

Animated Light Placement Over Time (Plotly Express)¶

The animated map shows how public lights have been placed across Eindhoven from 2000 onward, color-coded by light type. It allows us to observe rollout trends over time, identify when specific technologies (like LED) became widespread, and explore temporal patterns in urban lighting development. This is especially useful for urban planners and policy analysts.

In [ ]:
anim_df = df.copy()
anim_df['year'] = anim_df['DATE_PLACEMENT'].dt.year
anim_df = anim_df.dropna(subset=['year', 'latitude', 'longitude'])
anim_df = anim_df[anim_df['year'] >= 2000].sample(2000)
# Use sorted integer years so the animation frames play in chronological order
anim_df['year'] = anim_df['year'].astype(int)
anim_df = anim_df.sort_values('year')

# scatter_map replaces the deprecated scatter_mapbox (see https://plotly.com/python/mapbox-to-maplibre/)
fig = px.scatter_map(anim_df,
                     lat='latitude',
                     lon='longitude',
                     color='TYPE',
                     animation_frame='year',
                     map_style='carto-positron',
                     zoom=11,
                     height=600,
                     hover_name='NEIGHBORHOOD',
                     title='Animated Placement of Lights Over Time (2000–Present)')

fig.show()

Dashboard-style Filtered View (IPyWidgets + Plotly)¶

The interactive dashboard lets users filter public lights by type and zone, then instantly see the results on a map. It's a powerful tool for city engineers or planners to explore specific areas or technologies—for example, inspecting where LEDs are installed in the city center. This view supports targeted maintenance, upgrades, and data-driven decision-making.

In [ ]:
type_dropdown = widgets.Dropdown(
    options=['All'] + sorted(df['TYPE'].dropna().unique()),
    value='All',
    description='Type:'
)

zone_dropdown = widgets.Dropdown(
    options=['All'] + sorted(df['ZONE'].dropna().unique()),
    value='All',
    description='Zone:'
)

def update_plot(light_type, zone):
    filtered = df.copy()
    if light_type != 'All':
        filtered = filtered[filtered['TYPE'] == light_type]
    if zone != 'All':
        filtered = filtered[filtered['ZONE'] == zone]
        
    # scatter_map replaces the deprecated scatter_mapbox in recent Plotly versions
    fig = px.scatter_map(filtered.sample(min(1000, len(filtered))),
                         lat='latitude',
                         lon='longitude',
                         color='COLOR',
                         map_style='open-street-map',
                         zoom=11,
                         title=f"Filtered Lights View ({light_type}, {zone})")
    fig.show()

ui = widgets.VBox([type_dropdown, zone_dropdown])
out = widgets.interactive_output(update_plot, {'light_type': type_dropdown, 'zone': zone_dropdown})

display(ui, out)

Maintenance Priority Map with Marker Clustering (Folium)¶

The clustered map visualizes maintenance urgency for 1,000 sampled public lights. Nearby markers are grouped into clusters, and each marker is color-coded by priority level (e.g. Overdue, Due Soon), helping maintenance teams quickly spot problem areas and plan their work based on urgency and location. It's a practical tool for field operations and resource scheduling.

In [ ]:
today = pd.Timestamp.today()
df['days_to_maintenance'] = (df['DATE_MAINTENENCE'] - today).dt.days

priority_df = df[df['days_to_maintenance'].notna()].copy()
priority_df['urgency'] = pd.cut(priority_df['days_to_maintenance'], 
                                bins=[-10000, 0, 30, 90, 365, 10000], 
                                labels=["Overdue", "Due Soon", "In 3 Months", "In 1 Year", "Low Priority"])

m = folium.Map(location=eindhoven_center, zoom_start=13)
marker_cluster = MarkerCluster().add_to(m)

color_map = {
    "Overdue": "red",
    "Due Soon": "orange",
    "In 3 Months": "blue",
    "In 1 Year": "green",
    "Low Priority": "gray"
}

for _, row in priority_df.sample(1000).iterrows():
    folium.Marker(
        location=[row['latitude'], row['longitude']],
        icon=folium.Icon(color=color_map.get(row['urgency'], 'black')),
        popup=f"{row['urgency']}: {row['DATE_MAINTENENCE'].date()}"
    ).add_to(marker_cluster)

m
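
To see exactly where the urgency boundaries fall, it helps to run the `pd.cut` call above on a few hand-picked values. By default, `pd.cut` uses right-closed intervals, so 0 days falls into "Overdue" and 30 days into "Due Soon". The day counts below are made up to probe each edge.

```python
import pandas as pd

# Hand-picked day counts on and around each bin edge
days = pd.Series([-400, 0, 1, 30, 31, 90, 365, 366])
labels = ["Overdue", "Due Soon", "In 3 Months", "In 1 Year", "Low Priority"]

# Same bins as the notebook cell; intervals are (left, right] by default
urgency = pd.cut(days, bins=[-10000, 0, 30, 90, 365, 10000], labels=labels)

print(pd.DataFrame({"days": days, "urgency": urgency}))
```

Values outside the outer bounds (below −10,000 or above 10,000 days) would receive no label at all, which is why the map code falls back to a black marker.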

Neighborhood Comparison Radar Chart¶

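The radar chart compares five randomly sampled neighborhoods on their average wattage, brightness (lumen), and days until maintenance. It helps identify areas with higher energy consumption, brighter lighting, or more urgent maintenance needs, supporting targeted improvements or resource planning at the neighborhood level. Because the three metrics are on very different scales (mean lumen is typically in the thousands, mean wattage in the tens), the lumen axis tends to dominate the chart unless the values are normalized first.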

In [ ]:
radar_data = df.groupby('NEIGHBORHOOD').agg({
    'WATTAGE': 'mean',
    'LUMEN': 'mean',
    'days_to_maintenance': 'mean'
}).dropna().reset_index()

radar_sample = radar_data.sample(5)

fig = px.line_polar(radar_sample.melt(id_vars='NEIGHBORHOOD'), 
                    r='value', theta='variable', color='NEIGHBORHOOD', line_close=True,
                    title="Neighborhood Comparison (Mean Wattage, Lumen, Maintenance Urgency)")
fig.show()
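
Wattage, lumen, and days to maintenance are measured in different units and ranges, so plotting the raw means on one radial axis makes them hard to compare. A common remedy is min-max scaling of each metric to the range 0 to 1 before melting the data. The sketch below shows this on made-up neighborhood means shaped like `radar_data`; the neighborhood names and values are invented.

```python
import pandas as pd

# Made-up neighborhood means, shaped like radar_data
radar = pd.DataFrame({
    "NEIGHBORHOOD": ["A", "B", "C"],
    "WATTAGE": [40.0, 60.0, 80.0],
    "LUMEN": [3000.0, 9000.0, 6000.0],
    "days_to_maintenance": [-100.0, 400.0, 150.0],
})

metrics = ["WATTAGE", "LUMEN", "days_to_maintenance"]
span = radar[metrics].max() - radar[metrics].min()

# Min-max scaling: each metric column is mapped onto [0, 1]
scaled = radar.copy()
scaled[metrics] = (radar[metrics] - radar[metrics].min()) / span

print(scaled)
```

The scaled frame can then be passed to `melt` and `px.line_polar` in the same way as before. Note that a metric with identical values in every row has a span of zero and would turn into NaN.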

Maintenance Load Forecast¶

The line chart tracks how many lights are scheduled for maintenance each month, including months that have already passed. It helps visualize workload trends over time and supports long-term maintenance planning. Unusually sharp spikes and dates far in the future may indicate data quality issues, which should be addressed to ensure accurate forecasting.

In [32]:
monthly_maint = df['DATE_MAINTENENCE'].dropna().dt.to_period('M').value_counts().sort_index()

monthly_maint.plot(kind='line', figsize=(12,5), title='Upcoming Maintenance Load by Month')
plt.xlabel('Month')
plt.ylabel('Number of Lights Due for Maintenance')
plt.grid(True)
plt.show()

Color Palette Map: Visualizing Light Color in the City¶

The next map creatively visualizes public lights by their actual light color, using custom hex codes to closely mimic their appearance. It offers a visually rich overview of how different colors are distributed across Eindhoven, which can be useful for analyzing lighting design, ambiance, or visibility in specific areas.

In [ ]:
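def to_color_hex(name):
    # Map the Dutch color names from the COLOR column to display hex codes
    color_dict = {
        'geel': '#ffff00',
        'wit': '#ffffff',
        'warm wit': '#ffdfa3',
        'blauw': '#00aaff',
        'groen': '#00aa00',
        'rood': '#ff0000',
        'oranje': '#ffa500'
    }
    # Lowercase the name so values such as 'Rood' match; unknown colors fall back to gray
    return color_dict.get(name.lower(), '#aaaaaa')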

m = folium.Map(location=eindhoven_center, zoom_start=13)

for _, row in df.sample(1000).iterrows():
    color = to_color_hex(row['COLOR']) if pd.notna(row['COLOR']) else '#999999'
    folium.CircleMarker(
        location=[row['latitude'], row['longitude']],
        radius=2.5,
        color=color,
        fill=True,
        fill_color=color,
        fill_opacity=0.6
    ).add_to(m)

m

Maintenance Engineer View – Priority Planner¶

Now, we categorize public lights based on how soon they require maintenance. A large number are either overdue or considered low priority, which helps maintenance teams identify critical issues and balance workloads over time. It's a clear visual tool for setting action priorities.

In [34]:
df['days_to_maintenance'] = (df['DATE_MAINTENENCE'] - pd.Timestamp.today()).dt.days

df['urgency_level'] = pd.cut(df['days_to_maintenance'], 
                             bins=[-10000, 0, 30, 90, 365, 10000], 
                             labels=["Overdue", "Due Soon", "3 Months", "1 Year", "Low Priority"])

priority_summary = df['urgency_level'].value_counts().sort_index()

priority_summary.plot(kind='bar', title='Maintenance Priority Levels', color='crimson')
plt.ylabel("Number of Lights")
plt.xlabel("Urgency Level")
plt.grid(True)
plt.show()

Next, we map the most urgent maintenance tasks (Overdue and Due Soon). This helps maintenance engineers visually identify hotspots and plan their routes efficiently.

In [35]:
urgent_jobs = df[df['urgency_level'].isin(["Overdue", "Due Soon"])]

m = folium.Map(location=eindhoven_center, zoom_start=13)
for _, row in urgent_jobs.sample(min(1000, len(urgent_jobs))).iterrows():
    folium.Marker(
        location=[row['latitude'], row['longitude']],
        icon=folium.Icon(color='red' if row['urgency_level'] == "Overdue" else 'orange'),
        popup=f"{row['urgency_level']} - Due: {row['DATE_MAINTENENCE'].date()}"
    ).add_to(m)
m

Animated Placement and Maintenance Timeline¶

The animated map combines both light placement and planned maintenance events over time. By showing when and where each fixture was installed or scheduled for maintenance, it provides a comprehensive temporal overview of Eindhoven’s public lighting activity. This helps in understanding infrastructure growth and maintenance cycles together.

In [ ]:
anim_df = df.copy()
anim_df['placement_year'] = anim_df['DATE_PLACEMENT'].dt.year
anim_df['maint_year'] = anim_df['DATE_MAINTENENCE'].dt.year
anim_df = anim_df.dropna(subset=['latitude', 'longitude', 'placement_year'])

anim_df['event'] = 'Placed'
anim_df2 = anim_df.copy()
# For maintenance events, placement_year holds the maintenance year (it is the shared frame column)
anim_df2['placement_year'] = anim_df2['maint_year']
anim_df2['event'] = 'Maintenance'
# Drop maintenance rows without a date, then use sorted integer years as animation frames
anim_combined = pd.concat([anim_df, anim_df2]).dropna(subset=['placement_year'])
anim_combined = anim_combined.sample(3000)
anim_combined['placement_year'] = anim_combined['placement_year'].astype(int)
anim_combined = anim_combined.sort_values('placement_year')

# scatter_map replaces the deprecated scatter_mapbox (see https://plotly.com/python/mapbox-to-maplibre/)
fig = px.scatter_map(anim_combined,
                     lat='latitude',
                     lon='longitude',
                     color='event',
                     animation_frame='placement_year',
                     map_style='carto-positron',
                     zoom=11,
                     height=600,
                     title='Light Placement and Maintenance Over Time')

fig.show()

Heatmap of Light Density¶

The heatmap visualizes the density of public lights across Eindhoven. Brighter areas indicate clusters of high light concentration, which may correspond to urban centers, busy streets, or public spaces. It provides a fast, intuitive way to assess coverage intensity and identify underlit areas.

In [ ]:
heat_df = df[['latitude', 'longitude']].dropna().sample(5000)

m = folium.Map(location=eindhoven_center, zoom_start=13)
HeatMap(data=heat_df.values, radius=8, blur=15).add_to(m)
m

🧭 Simulated Maintenance Route (Nearest-Neighbor Approximation)¶

The next map simulates a maintenance engineer's daily route, visiting the 10 most urgent lights among those that are overdue or due for maintenance within the next 30 days. The route is built with a nearest-neighbor heuristic: starting from the first fixture, it always moves on to the closest unvisited fixture. This usually gives a short route, although not necessarily the shortest possible one.

The markers indicate stop order — the starting point is green, and subsequent stops are marked in blue, connected by a red polyline showing the full path.

This visualization is especially useful for:

  • Planning daily routes efficiently
  • Reducing travel time and costs
  • Helping maintenance teams prioritize based on urgency and location
In [ ]:
urgent_df = df[df['DATE_MAINTENENCE'] < pd.Timestamp.today() + pd.Timedelta(days=30)]
urgent_df = urgent_df.dropna(subset=['latitude', 'longitude'])
urgent_df = urgent_df.nsmallest(10, 'days_to_maintenance').copy()  # top 10 most urgent

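def nearest_neighbor_route(points):
    """Order points greedily: start at the first one, then always visit the closest remaining point."""
    remaining = points.copy()
    route = [remaining.pop(0)]
    while remaining:
        last = route[-1]
        # geodesic expects (latitude, longitude) pairs
        next_point = min(remaining, key=lambda p: geodesic((last[0], last[1]), (p[0], p[1])).meters)
        route.append(next_point)
        remaining.remove(next_point)
    return route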

coords = urgent_df[['latitude', 'longitude']].values.tolist()
ordered_route = nearest_neighbor_route(coords)

m = folium.Map(location=eindhoven_center, zoom_start=13)

for i, (lat, lon) in enumerate(ordered_route):
    folium.Marker(
        location=[lat, lon],
        popup=f"Stop {i+1}",
        icon=folium.Icon(color='blue' if i != 0 else 'green')
    ).add_to(m)

folium.PolyLine(ordered_route, color='red', weight=3, opacity=0.7).add_to(m)

m
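
To check how much the nearest-neighbor ordering actually helps, the total route length can be compared before and after reordering. The sketch below is self-contained and makes two substitutions that should be read as assumptions: it uses the haversine (spherical) distance instead of geopy's `geodesic`, which is slightly less accurate but needs no extra dependency, and it uses four made-up stops rather than the real urgent fixtures. The function names are hypothetical.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def route_length_m(route):
    """Sum of the leg distances along an ordered list of (lat, lon) stops."""
    return sum(haversine_m(p, q) for p, q in zip(route, route[1:]))

def nearest_neighbor(points):
    """Greedy ordering: start at the first point, always go to the closest unvisited one."""
    remaining = list(points)
    route = [remaining.pop(0)]
    while remaining:
        nxt = min(remaining, key=lambda p: haversine_m(route[-1], p))
        remaining.remove(nxt)
        route.append(nxt)
    return route

# Made-up stops near Eindhoven, deliberately listed in a zigzag order
stops = [(51.4400, 5.4700), (51.4600, 5.4700), (51.4410, 5.4700), (51.4590, 5.4700)]
ordered = nearest_neighbor(stops)

print(round(route_length_m(stops)), round(route_length_m(ordered)))
```

Because the heuristic is greedy, the result depends on the starting point and is not guaranteed to be the shortest route; for ten stops, comparing a few different starting points is a cheap sanity check.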

📌 Conclusion¶

In this project, we explored and visualized Eindhoven’s public lighting dataset using a wide range of interactive and analytical techniques. The dataset contained rich information about the type, color, location, installation date, and planned maintenance of over 56,000 light fixtures.

Through our analysis and visualizations, we gained several key insights:

  • White and yellow lights dominate the city’s infrastructure, with LED emerging as the most widely adopted light type, indicating a shift toward energy efficiency.
  • The installation timeline reveals a clear acceleration of modernization efforts after 2010.
  • Our maintenance urgency breakdown identified thousands of overdue fixtures, highlighting the importance of timely scheduling and resource planning.
  • We created multiple interactive maps and dashboards to support urban planners and maintenance teams, including tools for filtering by zone or light type, viewing upcoming workloads, and even simulating a nearest-neighbor route for urgent maintenance visits.
  • Creative visualizations, such as a color-coded light map and an animated timeline of light placement and maintenance, added an engaging and intuitive layer to understanding the data.

Overall, this project demonstrates how advanced data visualization techniques can transform raw infrastructure data into actionable insights for decision-makers, engineers, and citizens alike. These visual tools not only help improve operational efficiency but also enhance transparency and planning in public services.